CHOOSE A PROPERTY 
FOR MODELING 



CLASSIFY A SET OF MOLECULES 
BASED ON THE PRESENCE OR 
ABSENCE OF THE PROPERTY 



SELECT A SUBSET OF MOLECULES 
EXHIBITING THE PROPERTY 

SELECT ONE OR MORE MARKER 
MOLECULES FROM THE SUBSET 

FIG. 1 



FIG. 2 



COMPARE ALL MOLECULES 
IN A TRAINING SET OF N MOLECULES 
TO EACH OTHER ACCORDING TO 
A STRUCTURAL SIMILARITY METRIC 

SELECT A DTC MOLECULE 
FROM THE TRAINING SET 

SORT ALL OTHER MOLECULES IN 
THE TRAINING SET IN DESCENDING 
ORDER OF STRUCTURAL SIMILARITY 
TO THE SELECTED DTC MOLECULE 

COMPUTE A FRACTION CORRECTLY 
PREDICTED METRIC FOR EACH OF THE 
SORTED TRAINING MOLECULES 

SELECT ONE OR MORE FRACTION CORRECTLY 
PREDICTED THRESHOLDS 



COUNT THE NUMBER OF MOLECULES 
AWAY FROM THE DTC MOLECULE AT WHICH 
THE ACTUAL FRACTION CORRECTLY 
PREDICTED DROPS BELOW THE THRESHOLDS 




CHOOSE AS MARKER MOLECULES 
THOSE DTC MOLECULES FOR WHICH 
THE FRACTION CORRECTLY PREDICTED 
EXCEEDS A THRESHOLD FOR A 
SELECTED MINIMUM NUMBER OF MOLECULES 
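
The marker-selection steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method as claimed: the flowchart does not name the structural similarity metric, so a Tanimoto coefficient over sets of fingerprint bits is assumed; "fraction correctly predicted" is read here as the running fraction of sorted neighbours that share the candidate molecule's class; and how a molecule qualifies as a DTC molecule is not stated, so candidate indices are passed in by the caller. The names `tanimoto`, `molcnt`, and `select_markers` are hypothetical.

```python
from typing import FrozenSet, List, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Assumed similarity metric: shared bits divided by total distinct bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def molcnt(candidate: int, fps: Sequence[FrozenSet[int]],
           labels: Sequence[bool], fcp_threshold: float) -> int:
    """Count how many neighbours, taken in descending similarity to the
    candidate, are passed before the running fraction correctly predicted
    drops below fcp_threshold."""
    others = [i for i in range(len(fps)) if i != candidate]
    # Sort all other training molecules by similarity to the candidate.
    others.sort(key=lambda i: tanimoto(fps[candidate], fps[i]), reverse=True)
    correct = 0
    for rank, i in enumerate(others, start=1):
        # A neighbour counts as "correctly predicted" if it shares the class.
        correct += int(labels[i] == labels[candidate])
        if correct / rank < fcp_threshold:
            return rank - 1
    return len(others)


def select_markers(candidates: Sequence[int], fps: Sequence[FrozenSet[int]],
                   labels: Sequence[bool], fcp_threshold: float,
                   min_molcnt: int) -> List[int]:
    """Keep candidates whose fraction correctly predicted stays at or above
    the threshold for at least min_molcnt neighbours."""
    return [c for c in candidates
            if molcnt(c, fps, labels, fcp_threshold) >= min_molcnt]


# Toy training set: three similar "positive" molecules, two "negative" ones.
fps = [frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({1, 2, 5}),
       frozenset({7, 8, 9}), frozenset({7, 8, 10})]
labels = [True, True, True, False, False]
markers = select_markers([0, 1, 2], fps, labels, fcp_threshold=0.8, min_molcnt=2)
```

In this toy data each positive molecule has two same-class nearest neighbours before the running fraction falls below 0.8, so all three candidates qualify at `min_molcnt=2` and none at `min_molcnt=3`.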



SELECT MOLECULE WITH 
UNKNOWN PROTEIN BINDING 
CHARACTERISTICS 






COMPARE STRUCTURE TO 
A MARKER MOLECULE WITH 
HIGH PROTEIN BINDING 

CLASSIFY UNKNOWN MOLECULE 
AS HIGHLY PROTEIN BOUND 

COMPARE STRUCTURE TO 
ANOTHER MARKER MOLECULE 
WITH HIGH PROTEIN BINDING 



CLASSIFY MOLECULE 
AS NOT HIGHLY 
PROTEIN BOUND 



FIG. 3 
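
The classification flow above can be read as a loop over marker molecules. The sketch below is a minimal illustration under assumptions: the flowchart does not state the test applied at each comparison, so a similarity cutoff is assumed, and a Tanimoto coefficient over fingerprint bit sets stands in for the unspecified structural comparison. The names `tanimoto`, `is_highly_bound`, and `cutoff` are hypothetical.

```python
from typing import FrozenSet, Iterable


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Assumed similarity metric: shared bits divided by total distinct bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def is_highly_bound(unknown: FrozenSet[int],
                    markers: Iterable[FrozenSet[int]],
                    cutoff: float) -> bool:
    """Compare the unknown molecule to each highly bound marker in turn.

    Returns True at the first marker that is similar enough (assumed test),
    and False once every marker has been tried without a match.
    """
    for marker in markers:
        if tanimoto(unknown, marker) >= cutoff:
            return True   # classify as highly protein bound
    return False          # classify as not highly protein bound


# Toy markers and unknowns, for illustration only.
markers = [frozenset({1, 2, 3}), frozenset({4, 5, 6})]
close = is_highly_bound(frozenset({4, 5, 7}), markers, cutoff=0.5)
far = is_highly_bound(frozenset({8, 9}), markers, cutoff=0.5)
```

The first unknown fails against the first marker but matches the second, which mirrors the "compare structure to another marker molecule" branch.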



SELECT A RANGE FOR MINIMUM 
MOLCNT DISTANCE AND RANGE 
OF MINIMUM FCP THRESHOLDS 



DETERMINE A SET OF MARKER 
MOLECULES FOR EACH COMBINATION 
OF MOLCNT AND FCP WITHIN 
THE RANGES 



CLASSIFY ALL TRAINING SET COMPOUNDS 
USING EACH SET OF MARKER MOLECULES 

COMPARE FRACTIONS CORRECTLY 
CLASSIFIED FOR TRAINING SET COMPOUNDS 



SELECT AS A FINAL SET OF MARKER 
MOLECULES THE SET THAT HAS THE BEST 
PREDICTIVE ABILITY FOR ALL THE 
TRAINING SET COMPOUNDS 

FIG. 4 
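
The parameter sweep above amounts to a grid search over MOLCNT and FCP values. The sketch below is a hedged illustration, not the procedure as claimed: `build_markers` and `classify` are hypothetical callables standing in for the marker-selection and classification steps, and "best predictive ability" is taken to mean the highest fraction of training set compounds correctly classified, with ties going to the first combination tried.

```python
from itertools import product
from typing import Callable, Iterable, Sequence, Set, Tuple


def best_marker_set(
    molcnt_range: Iterable[int],
    fcp_range: Iterable[float],
    build_markers: Callable[[int, float], Set[int]],
    classify: Callable[[int, Set[int]], bool],
    training_labels: Sequence[bool],
) -> Tuple[float, int, float, Set[int]]:
    """Try every (MOLCNT, FCP) combination and keep the marker set that
    classifies the most training compounds correctly."""
    best = None
    for min_molcnt, min_fcp in product(molcnt_range, fcp_range):
        # Determine a marker set for this combination of thresholds.
        markers = build_markers(min_molcnt, min_fcp)
        # Classify every training compound with that marker set.
        predictions = [classify(i, markers) for i in range(len(training_labels))]
        hits = sum(p == y for p, y in zip(predictions, training_labels))
        accuracy = hits / len(training_labels)
        if best is None or accuracy > best[0]:
            best = (accuracy, min_molcnt, min_fcp, markers)
    return best


# Toy stand-ins: a looser MOLCNT admits one wrong marker (index 2).
training_labels = [True, True, False, False]
result = best_marker_set(
    molcnt_range=[1, 2],
    fcp_range=[0.7, 0.9],
    build_markers=lambda m, f: {0, 1} if m == 2 else {0, 1, 2},
    classify=lambda i, markers: i in markers,
    training_labels=training_labels,
)
```

Passing the two steps in as callables keeps the sweep independent of whichever similarity metric and classification test are actually used.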




FIG. 5 



